Title: Optimistic Reinforcement Learning: Computational and Neural Bases
Running Title: Optimistic Reinforcement Learning
Authors
Abstract
Affiliations:
Laboratoire de Neurosciences Cognitives (LNC), INSERM U960, Ecole Normale Supérieure, 75005, Paris, France.
Laboratoire d'Économie Mathématique et de Microéconomie Appliquée (LEMMA), Université Panthéon-Assas, 75006, Paris, France.
Amsterdam Brain and Cognition (ABC), Nieuwe Achtergracht 129, 1018 WS Amsterdam, the Netherlands.
Amsterdam School of Economics (ASE), Faculty of Economics and Business (FEB), Roetersstraat 11, 1018 WB Amsterdam, the Netherlands.
Cognitive Neuroimaging Unit, CEA, INSERM, Université Paris-Sud, Université Paris-Saclay, NeuroSpin center, 91191 Gif/Yvette, France.
Institut Jean-Nicod (IJN), CNRS UMR 8129; Ecole Normale Supérieure, 75005, Paris, France.
Institute of Cognitive Neurosciences (ICN), University College London, WC1N 3AR London, UK.
Similar resources
On Optimistic versus Randomized Exploration in Reinforcement Learning
We discuss the relative merits of optimistic and randomized approaches to exploration in reinforcement learning. Optimistic approaches presented in the literature apply an optimistic boost to the value estimate at each state-action pair and select actions that are greedy with respect to the resulting optimistic value function. Randomized approaches sample from among statistically plausible valu...
Domain-Independent Optimistic Initialization for Reinforcement Learning
In Reinforcement Learning (RL), it is common to use optimistic initialization of value functions to encourage exploration. However, such an approach generally depends on the domain, viz., the scale of the rewards must be known, and the feature representation must have a constant norm. We present a simple approach that performs optimistic initialization with less dependence on the domain.
Adaptive Reactive Job-shop Scheduling with Reinforcement Learning Agents
Traditional approaches to solving job-shop scheduling problems assume full knowledge of the problem and search for a centralized solution for a single problem instance. Finding optimal solutions, however, requires an enormous computational effort, which becomes critical for large problem instance sizes and, in particular, in situations where frequent changes in the environment occur. In this ar...
Optimistic Simulated Exploration as an Incentive for Real Exploration
Many reinforcement learning exploration techniques are overly optimistic and try to explore every state. Such exploration is impossible in environments with an unlimited number of states. I propose to use simulated exploration with an optimistic model to discover promising paths for real exploration. This reduces the need for real exploration.
Reinforcement Learning using Optimistic Process Filtered Models
An important problem in reinforcement learning is determining how to act while learning, sometimes referred to as the exploration-exploitation dilemma or the problem of optimal learning. The problem is intractable and is usually solved through approximation, such as by being optimistic in the face of uncertainty. In environments with inherent determinism, arising for example from known process template...
Publication date: 2016